Information content and analysis methods for Multi-Modal High-Throughput Biomedical Data
Abstract
The spectrum of modern molecular high-throughput assaying includes diverse technologies such as microarray gene expression, miRNA expression, proteomics, and DNA methylation, among many others. Now that these technologies have matured and become increasingly accessible, the next frontier is to collect "multi-modal" data for the same set of subjects and conduct integrative, multi-level analyses. While multi-modal data does contain distinct biological information that can be useful for answering complex biological questions, its value for predicting clinical phenotypes and the contribution of each type of input remain unknown. We obtained 47 datasets/predictive tasks that in total span over 9 data modalities and executed analytic experiments for predicting various clinical phenotypes and outcomes. First, we analyzed each modality separately using uni-modal approaches based on several state-of-the-art supervised classification and feature selection methods. Then, we applied integrative multi-modal classification techniques. We found that gene expression is the most predictively informative modality. Other modalities, such as protein expression, miRNA expression, and DNA methylation, also provide highly predictive results, which are often statistically comparable but not superior to those obtained from gene expression data. Integrative multi-modal analyses generally do not increase predictive signal compared to gene expression data alone.
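The comparison described in the abstract (each modality on its own, then an integrative analysis) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the modality names and effect sizes are invented, a nearest-centroid classifier with leave-one-out cross-validation stands in for the unspecified "state-of-the-art" methods, simple feature concatenation stands in for the integrative techniques, and no feature selection is performed. It is not the procedure used in the study.

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean training vector is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out cross-validated accuracy of the nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def make_modality(labels, n_features, shift, rng):
    """Hypothetical synthetic modality: unit Gaussian noise plus a class-dependent mean shift."""
    return [[rng.gauss(shift * lab, 1.0) for _ in range(n_features)] for lab in labels]


def concatenate(*modalities):
    """Naive integration: join each subject's feature vectors across modalities."""
    return [sum(rows, []) for rows in zip(*modalities)]


rng = random.Random(0)                      # fixed seed for reproducibility
labels = [i % 2 for i in range(40)]         # 40 subjects, two balanced classes
modalities = {
    # Assumed effect sizes: a strong signal in one modality, a weak one in the other.
    "gene_expression": make_modality(labels, 20, 1.0, rng),
    "methylation": make_modality(labels, 20, 0.3, rng),
}

# Uni-modal analyses first, then the integrated (concatenated) analysis.
results = {name: loo_accuracy(X, labels) for name, X in modalities.items()}
results["integrated"] = loo_accuracy(concatenate(*modalities.values()), labels)

for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

Comparing the three printed accuracies mirrors the question the study asks: whether the integrated feature set adds predictive signal beyond the single most informative modality. On real data, such differences would need a statistical test across many datasets rather than a single run.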
Related resources
Optimized co-registration method of Spinal cord MR Neuroimaging data analysis and application for generating multi-parameter maps
Introduction: The purpose of multimodal co-registration in MR neuroimaging is to fuse two or more sets of images (T1, T2, fMRI, DTI, pMRI, …) so that their different information is combined into a composite, correlated data set for visualization, re-alignment, and generating transforms to a functional matrix. Multimodal registration and motion correction in spinal cord MR Neuroimag...
A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data
MOTIVATION: Recent advances in high-throughput omics technologies have enabled biomedical researchers to collect large-scale genomic data. As a consequence, there has been growing interest in developing methods to integrate such data to obtain deeper insights regarding the underlying biological system. A key challenge for integrative studies is the heterogeneity present in the different omics da...
Fusing Biomedical Multi-modal Data for Exploratory Data Analysis
Data analysis in modern biomedical research has to integrate data from different sources, such as microarray, clinical, and categorical data, so-called multi-modal data. The reef SOM, a metaphoric display, is applied and further improved so that it allows the simultaneous display of biomedical multi-modal data for exploratory analysis. Visualizations of microarray, clinical, and category data ...
Knowledge extraction and data mining algorithms for complex biomedical data
Due to advances in high-throughput technologies for data acquisition in biomedicine, an increasing amount of data is being produced. This thesis proposes innovative algorithms for data mining on biomedical data, ranging from clustering via semi-supervised clustering to classification. Clustering aims at deriving a previously unknown grouping of a data set and can be applied e.g. to investigate diff...
Multivariate Multi-Way Modelling of Multiple High-Dimensional Data Sources
Aalto University, P.O. Box 11000, FI-00076 Aalto, www.aalto.fi. Author: Ilkka Huopaniemi. Name of the doctoral dissertation: Multivariate Multi-Way Modelling of Multiple High-Dimensional Data Sources. Publisher: School of Science. Unit: Department of Information and Computer Science. Series: Aalto University publication series DOCTORAL DISSERTATIONS 117/2012. Field of research: Computer and Information Scie...